Let $A$ be an $n \times n$ Hermitian matrix and let $B$ be the $(n-1) \times (n-1)$ matrix constructed by deleting the $i$-th row and $i$-th column of $A$. Denote $\Phi = [\phi(x_1), \ldots, \phi(x_n)]^{\top} \in \mathbb{R}^{n \times D}$, where $D$ is the dimension of the feature space $\mathcal{H}$. Performing a rank-$n$ singular value decomposition (SVD) on $\Phi$ gives $\Phi = H \Sigma V^{\top}$, where $H \in \mathbb{R}^{n \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$ is a diagonal matrix whose diagonal elements are the singular values of $\Phi$, and $V \in \mathbb{R}^{D \times n}$. $F(\alpha)$ in Eq. (21) is proven differentiable, and the $p$-th component of its gradient, $\partial F(\alpha) / \partial \alpha_p$, can be computed in closed form. A reduced gradient descent algorithm [26] is then adopted to optimize Eq. (21). The three deep neural networks are pre-trained on ImageNet [5].
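As a minimal numerical sketch of the decomposition above (random features stand in for the maps $\phi(x_i)$; all names are illustrative), a rank-$n$ SVD of an $n \times D$ matrix can be computed and checked as:

```python
import numpy as np

# Toy stand-in for Phi = [phi(x_1), ..., phi(x_n)]^T in R^{n x D}:
# random features instead of a real feature map.
rng = np.random.default_rng(0)
n, D = 5, 8
Phi = rng.standard_normal((n, D))

# Rank-n (thin) SVD: Phi = H @ diag(sigma) @ Vt, with H in R^{n x n},
# sigma holding the n singular values, and Vt = V^T in R^{n x D}.
H, sigma, Vt = np.linalg.svd(Phi, full_matrices=False)

# Reconstruction and orthonormality checks.
assert np.allclose(Phi, H @ np.diag(sigma) @ Vt)
assert np.allclose(H.T @ H, np.eye(n), atol=1e-10)
assert np.all(sigma[:-1] >= sigma[1:])  # singular values sorted descending
```

With `full_matrices=False`, NumPy returns exactly the thin factors matching the rank-$n$ decomposition in the text.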
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > India > Karnataka > Bengaluru (0.04)
A Proofs

A.1 Proof of Proposition 4.1

Proof. The first lemma is Lemma 3 in [24]: let A be a Hermitian matrix and let B be a Hermitian perturbation of A. To apply Lemma A.1, we must study the relationship between the minimum eigenvalue gaps of the two matrices. By Lemma A.2, together with the proof of Theorem 5.2, the claimed bound follows.

A.5 The Optimization of SimpleMKKM

SimpleMKKM aims to solve a kernel alignment-based min-max optimization problem over the kernel weights. Assume that the number of iterations is T.

Table 4: Large-scale datasets used in the experiments

Dataset  | Samples | Views | Clusters
NUSWIDE  | 30000   | 5     | 31
AwA      | 30475   | 6     | 50
MNIST    | 60000   | 3     | 10
YtVideo  | 101499  | 5     | 31

B.2 Clustering Performance with Different Numbers of Landmarks

As the number of landmarks increases, the ACC of the proposed method approaches that of SimpleMKKM and tends to stabilize, showing that not many landmarks are needed. To verify the assumptions about the eigenvalues of the empirical kernel matrix in Theorem 5.2, and to give further empirical evidence for the proposed method, we conduct additional experiments against three classic algorithms, including average multiple kernel k-means. The results are reported in the following three tables.
- North America > United States > New York (0.04)
- Asia > Taiwan (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
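The reduced gradient descent mentioned above must keep the kernel weights $\alpha$ on the probability simplex while descending. A minimal sketch, with a toy quadratic objective standing in for $F(\alpha)$ (the step rule and learning rate are illustrative, not the paper's exact update):

```python
import numpy as np

def reduced_gradient_step(alpha, grad, lr=0.1):
    """One reduced-gradient step that keeps alpha on the probability simplex.

    The equality constraint sum(alpha) = 1 is eliminated by expressing the
    component with the largest weight (index u) through the others."""
    u = int(np.argmax(alpha))
    red = grad - grad[u]                    # reduced gradient; entry u is 0
    direction = -red
    # components sitting at the boundary alpha_p = 0 may not decrease further
    direction[(alpha <= 0) & (direction < 0)] = 0.0
    direction[u] = -np.sum(np.delete(direction, u))  # preserve sum(alpha) = 1
    new = np.clip(alpha + lr * direction, 0.0, None)
    return new / new.sum()

# Toy objective F(alpha) = 0.5 * alpha^T Q alpha with a fixed PSD matrix Q;
# its simplex-constrained minimizer equalizes the gradient components.
Q = np.array([[2.0, 0.1], [0.1, 1.0]])
alpha = np.array([0.5, 0.5])
for _ in range(50):
    alpha = reduced_gradient_step(alpha, Q @ alpha)

assert abs(alpha.sum() - 1.0) < 1e-9 and np.all(alpha >= 0)
```

At convergence the reduced gradient vanishes, i.e. all gradient components of the active weights are equal, which is the simplex-constrained optimality condition.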
DUOL: A Double Updating Approach for Online Learning
Peilin Zhao, Steven C. Hoi, Rong Jin
In most online learning algorithms, the weights assigned to the misclassified examples (or support vectors) remain unchanged during the entire learning process. This is clearly insufficient: when a new misclassified example is added to the pool of support vectors, we generally expect it to affect the weights of the existing support vectors. In this paper, we propose a new online learning method, termed Double Updating Online Learning, or DUOL for short. Instead of only assigning a fixed weight to the misclassified example received in the current trial, the proposed online learning algorithm also updates the weight of one of the existing support vectors. We show that the mistake bound can be significantly improved by the proposed online learning method. Encouraging experimental results show that the proposed technique is in general considerably more effective than state-of-the-art online learning algorithms.
- Asia > Singapore (0.04)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
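The double-updating idea in the DUOL abstract above can be sketched as a kernel perceptron that, on each mistake, also boosts one conflicting existing support vector. This is a hedged illustration only: the class name, RBF kernel, conflict threshold, and fixed weight increments are my own choices, not the paper's closed-form dual updates.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

class DoubleUpdateKernelPerceptron:
    """Sketch of a double-updating online learner (DUOL-style)."""

    def __init__(self, eta=1.0, conflict_thresh=0.5):
        self.sv_x, self.sv_y, self.sv_a = [], [], []
        self.eta, self.thresh = eta, conflict_thresh

    def decision(self, x):
        return sum(a * y * rbf(x, z) for a, y, z in
                   zip(self.sv_a, self.sv_y, self.sv_x))

    def fit_one(self, x, y):
        if y * self.decision(x) > 0:
            return False                      # correct, no update
        # first update: add the misclassified example as a support vector
        self.sv_x.append(x); self.sv_y.append(y); self.sv_a.append(self.eta)
        # second update: boost the most conflicting EXISTING support vector
        # (high kernel similarity to x but opposite label contribution)
        best, best_conf = None, self.thresh
        for b in range(len(self.sv_x) - 1):
            conf = -y * self.sv_y[b] * rbf(x, self.sv_x[b])
            if conf > best_conf:
                best, best_conf = b, conf
        if best is not None:
            self.sv_a[best] += self.eta
        return True

rng = np.random.default_rng(1)
clf = DoubleUpdateKernelPerceptron()
mistakes = 0
for _ in range(100):
    x = rng.standard_normal(2)
    y = 1.0 if x[0] + x[1] > 0 else -1.0
    mistakes += clf.fit_one(x, y)
```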
Detecting Backdoor Samples in Contrastive Language Image Pretraining
Huang, Hanxun, Erfani, Sarah, Li, Yige, Ma, Xingjun, Bailey, James
Contrastive language-image pretraining (CLIP) has been found to be vulnerable to poisoning backdoor attacks where the adversary can achieve an almost perfect attack success rate on CLIP models by poisoning only 0.01\% of the training dataset. This raises security concerns about the current practice of pretraining large-scale models on unscrutinized web data using CLIP. In this work, we analyze the representations of backdoor-poisoned samples learned by CLIP models and find that they exhibit unique characteristics in their local subspace, i.e., their local neighborhoods are far sparser than those of clean samples. Based on this finding, we conduct a systematic study on detecting CLIP backdoor attacks and show that these attacks can be easily and efficiently detected by traditional density ratio-based local outlier detectors, whereas existing backdoor sample detection methods fail. Our experiments also reveal that an unintentional backdoor already exists in the original CC3M dataset and has been trained into a popular open-source model released by OpenCLIP. Based on our detector, one can clean up a million-scale web dataset (e.g., CC3M) efficiently within 15 minutes using 4 Nvidia A100 GPUs. The code is publicly available in our \href{https://github.com/HanxunH/Detect-CLIP-Backdoor-Samples}{GitHub repository}.
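A density ratio-based local outlier detector of the kind the abstract describes can be sketched with a simplified LOF-style score (the function name, the choice of k, and the toy data are illustrative; production detectors such as LOF refine this with reachability distances):

```python
import numpy as np

def knn_density_ratio_scores(X, k=5):
    """Simplified density-ratio local outlier score (LOF-like sketch).

    score(i) = mean k-NN density of i's neighbors / k-NN density of i,
    where density(i) = 1 / (mean distance to its k nearest neighbors).
    Samples in sparse local neighborhoods, the backdoor signature described
    above, receive scores well above 1."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-distance
    nn = np.argsort(d, axis=1)[:, :k]         # indices of k nearest neighbors
    knn_dist = np.take_along_axis(d, nn, axis=1).mean(axis=1)
    density = 1.0 / (knn_dist + 1e-12)
    return density[nn].mean(axis=1) / density

rng = np.random.default_rng(0)
clean = rng.standard_normal((200, 8)) * 0.5   # dense, clean-like cluster
outlier = np.full((1, 8), 6.0)                # isolated, sparse neighborhood
scores = knn_density_ratio_scores(np.vstack([clean, outlier]))
assert np.argmax(scores) == 200               # the isolated sample scores highest
```

The pairwise-distance matrix makes this O(n^2) in memory; at million scale one would use approximate nearest-neighbor search instead.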
Bias Detection via Maximum Subgroup Discrepancy
Němeček, Jiří, Kozdoba, Mark, Kryvoviaz, Illia, Pevný, Tomáš, Mareček, Jakub
Bias evaluation is fundamental to trustworthy AI, both in terms of checking data quality and in terms of checking the outputs of AI systems. In testing data quality, for example, one may study the distance from a given dataset, viewed as a distribution, to a ground-truth reference dataset. However, classical metrics, such as the Total Variation and Wasserstein distances, are known to have high sample complexities and, therefore, may fail to provide meaningful distinctions in many practical scenarios. In this paper, we propose a new notion of distance, the Maximum Subgroup Discrepancy (MSD). In this metric, two distributions are close if, roughly, discrepancies are low for all feature subgroups. While the number of subgroups may be exponential, we show that the sample complexity is linear in the number of features, making the metric feasible for practical applications. Moreover, we provide a practical algorithm for evaluating the distance, based on mixed-integer optimization (MIO). We also note that the proposed distance is easily interpretable, thus providing clearer paths to fixing biases once they have been identified, and it provides guarantees for all subgroups. Finally, we empirically evaluate MSD, compare it with other metrics, and demonstrate the above properties on real-world datasets.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Mississippi (0.05)
- North America > United States > Maine (0.05)
- (11 more...)
- Law (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
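The MSD notion described above can be illustrated by brute force on binary features (the paper uses mixed-integer optimization precisely to avoid this exponential enumeration; all names here are illustrative):

```python
import itertools
import numpy as np

def max_subgroup_discrepancy(A, B):
    """Brute-force sketch of Maximum Subgroup Discrepancy over binary features.

    A subgroup is a conjunction fixing a subset of features to 0/1 values;
    MSD is the largest absolute difference between the fraction of A and the
    fraction of B falling in any such subgroup."""
    n_feat = A.shape[1]
    best = 0.0
    for size in range(1, n_feat + 1):
        for feats in itertools.combinations(range(n_feat), size):
            for vals in itertools.product([0, 1], repeat=size):
                in_a = np.all(A[:, list(feats)] == vals, axis=1).mean()
                in_b = np.all(B[:, list(feats)] == vals, axis=1).mean()
                best = max(best, abs(in_a - in_b))
    return best

# Two toy binary datasets whose largest gap is on the subgroup (f0=1, f1=1):
# it covers half of A but none of B.
A = np.array([[1, 1], [1, 1], [0, 1], [0, 0]])
B = np.array([[1, 0], [0, 1], [0, 1], [0, 0]])
msd = max_subgroup_discrepancy(A, B)   # 0.5, attained at (f0=1, f1=1)
```

The interpretability claim is visible here: the maximizing conjunction itself names the subgroup where the two datasets disagree most.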
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment
Shen, Yunyi, Sun, Hao, Ton, Jean-François
Building neural reward models from human preferences is a pivotal component in reinforcement learning from human feedback (RLHF) and large language model alignment research. Given the scarcity and high cost of human annotation, how to select the most informative pairs to annotate is an essential yet challenging open problem. In this work, we highlight the insight that an ideal comparison dataset for reward modeling should balance exploration of the representation space with informative comparisons between pairs with moderate reward differences. Technically, challenges arise in quantifying the two objectives and efficiently prioritizing the comparisons to be annotated. To address this, we propose Fisher information-based selection strategies, adapting theory from the classical experimental design literature and applying it to the final linear layer of deep neural network-based reward models. Empirically, our method demonstrates remarkable performance, high computational efficiency, and stability compared to other selection methods from the deep learning and classical statistical literature across multiple open-source LLMs and datasets. Further ablation studies reveal that incorporating cross-prompt comparisons in active reward modeling significantly enhances labeling efficiency, shedding light on the potential for improved annotation strategies in RLHF.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
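The Fisher information-based selection described above can be sketched, for a Bradley-Terry reward model with a linear last layer, as greedy D-optimal pair selection (the greedy rule, ridge term, and all names are my assumptions, not the paper's exact algorithm):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def greedy_d_optimal_pairs(feats, w, n_pick, ridge=1e-3):
    """Greedily pick comparison pairs maximizing log-det Fisher information.

    For a pair (i, j) with feature difference d = x_i - x_j and win
    probability p = sigmoid(w . d), the Fisher contribution is
    p * (1 - p) * d d^T. The rule therefore favors spread-out directions
    (exploration) AND moderate predicted reward gaps (p near 1/2)."""
    n, dim = feats.shape
    info = ridge * np.eye(dim)                 # ridge keeps log-det finite
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    chosen = []
    for _ in range(n_pick):
        best, best_gain = None, -np.inf
        for (i, j) in pairs:
            d = feats[i] - feats[j]
            p = sigmoid(w @ d)
            cand = info + p * (1 - p) * np.outer(d, d)
            gain = np.linalg.slogdet(cand)[1]
            if gain > best_gain:
                best, best_gain = (i, j), gain
        chosen.append(best)
        i, j = best
        d = feats[i] - feats[j]
        p = sigmoid(w @ d)
        info += p * (1 - p) * np.outer(d, d)   # commit the chosen pair
        pairs.remove(best)
    return chosen

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4))            # toy last-layer embeddings
picked = greedy_d_optimal_pairs(feats, w=rng.standard_normal(4), n_pick=3)
```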
How Does the Spatial Distribution of Pre-training Data Affect Geospatial Foundation Models?
Purohit, Mirali, Muhawenayo, Gedeon, Rolf, Esther, Kerner, Hannah
Foundation models have made rapid advances in many domains including Earth observation, where Geospatial Foundation Models (GFMs) can help address global challenges such as climate change, agriculture, and disaster response. Previous work on GFMs focused on tailoring model architectures and pretext tasks, and did not investigate the impact of pre-training data selection on model performance. However, recent work in other domains shows that the pre-training data distribution is an important factor influencing the performance of foundation models. With this motivation, our research explores how the geographic distribution of pre-training data affects the performance of GFMs. We evaluated several pre-training data distributions by sampling different compositions from a global data pool. Our experiments with two GFMs on downstream tasks indicate that balanced and globally representative data compositions often outperform region-specific sampling, highlighting the importance of diversity and global coverage in pre-training data. Our results also suggest that the most appropriate data sampling technique may depend on the specific GFM architecture. These findings will support the development of robust GFMs by incorporating quality pre-training data distributions, ultimately improving machine learning solutions for Earth observation.
- South America (0.04)
- Oceania (0.04)
- Europe (0.04)
- (7 more...)
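The balanced, globally representative composition the abstract favors can be sketched as stratified draws with (near-)equal counts per region, in contrast to region-specific sampling (region labels and pool sizes are toy stand-ins):

```python
from collections import Counter

import numpy as np

def balanced_sample(regions, n_total, rng):
    """Draw a pre-training subset with (near-)equal counts per region.

    A region contributes fewer samples only if its pool is smaller than
    its equal share."""
    regions = np.asarray(regions)
    unique = np.unique(regions)
    per_region = n_total // len(unique)
    idx = []
    for r in unique:
        pool = np.flatnonzero(regions == r)
        take = min(per_region, len(pool))
        idx.extend(rng.choice(pool, size=take, replace=False))
    return np.array(idx)

rng = np.random.default_rng(0)
# toy global pool heavily skewed toward one region
regions = ["NA"] * 700 + ["EU"] * 200 + ["AF"] * 100
idx = balanced_sample(regions, n_total=150, rng=rng)
counts = Counter(np.asarray(regions)[idx])     # 50 samples per region
```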
ML$^2$Tuner: Efficient Code Tuning via Multi-Level Machine Learning Models
Cha, JooHyoung, Lee, Munyoung, Kwon, Jinse, Lee, Jubin, Lee, Jemin, Kwon, Yongin
The increasing complexity of deep learning models necessitates specialized hardware and software optimizations, particularly for deep learning accelerators. Existing autotuning methods often suffer from prolonged tuning times due to profiling invalid configurations, which can cause runtime errors. We introduce ML$^2$Tuner, a multi-level machine learning tuning technique that enhances autotuning efficiency by incorporating a validity prediction model to filter out invalid configurations, along with an advanced performance prediction model that utilizes hidden features from the compilation process. Experimental results on an extended VTA accelerator demonstrate that ML$^2$Tuner achieves equivalent performance improvements using only 12.3% of the samples required by a similar approach in TVM, and reduces invalid profiling attempts by an average of 60.8%, highlighting its potential to enhance autotuning performance by filtering out invalid configurations.
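The multi-level idea in the abstract, screening configurations with a cheap validity predictor before the costly profiling step ever runs, can be sketched as follows (the callables and the toy validity rule are illustrative, not ML$^2$Tuner's actual models):

```python
import numpy as np

def tune_with_validity_filter(candidates, validity_model, profile, threshold=0.5):
    """Profile only configurations the validity model predicts will not crash.

    Returns the best (lowest-cost) configuration and how many candidates
    were skipped by the validity filter."""
    probs = np.array([validity_model(c) for c in candidates])
    keep = [c for c, p in zip(candidates, probs) if p >= threshold]
    results = {tuple(c): profile(c) for c in keep}   # expensive step, filtered
    return min(results, key=results.get), len(candidates) - len(keep)

# Toy setup: a config (tile, unroll) is invalid when tile * unroll > 64.
def toy_validity(c):
    return 1.0 if c[0] * c[1] <= 64 else 0.0

def toy_profile(c):
    # pretend cost: lower is better, larger tiles/unrolls run faster
    return 100.0 / (c[0] + c[1])

cands = [(t, u) for t in (4, 8, 16, 32) for u in (1, 2, 4)]
best, skipped = tune_with_validity_filter(cands, toy_validity, toy_profile)
```

In the toy run only (32, 4) is filtered out, so the expensive `profile` call never sees the one configuration that would have failed.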